Protein classification using modified n-grams and skip-grams

Similar resources

Protein classification using modified n-grams and skip-grams.

Motivation: Classification by supervised machine learning greatly facilitates the annotation of protein characteristics from their primary sequence. However, the feature generation step in this process requires detailed knowledge of the attributes used to classify the proteins. Without this knowledge, irrelevant features may be selected, resulting in a faulty model. In this study, we introduce...
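As a rough illustration of the kind of sequence features the title refers to, the sketch below counts plain n-grams and a simple skip-gram variant over an amino-acid string. It is not the paper's method: the "modified" variants are not described in the excerpt above. The function names and the definition of a skip-gram as a pair of residues separated by exactly k skipped positions are assumptions made for this example.

```python
from collections import Counter


def ngrams(seq, n):
    """Count contiguous n-grams (k-mers) in a sequence string."""
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


def skipgrams(seq, k):
    """Count residue pairs separated by exactly k skipped positions.

    Assumption: this is one simple skip-gram definition; other
    definitions allow up to k skips or more than two residues.
    """
    return Counter(seq[i] + seq[i + k + 1] for i in range(len(seq) - k - 1))


if __name__ == "__main__":
    toy = "MKVLA"  # hypothetical toy sequence, not real data
    print(ngrams(toy, 2))
    print(skipgrams(toy, 1))
```

Counts like these can be turned into fixed-length feature vectors for a supervised classifier, which is the general setting the abstract describes.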

Comparing Medline citations using modified N-grams

OBJECTIVE: We aim to identify duplicate pairs of Medline citations, particularly when the documents are not identical but contain similar information. MATERIALS AND METHODS: Duplicate pairs of citations are identified by comparing word n-grams in pairs of documents. N-grams are modified using two approaches which take account of the fact that the document may have been altered. These are: (1) d...
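The idea of comparing word n-grams between two documents can be sketched as follows. This is a minimal example using unmodified n-grams. The Jaccard overlap score, the default n of 3, and the function names are assumptions chosen for illustration, since the excerpt does not state which similarity measure or n-gram modifications the authors use.

```python
def word_ngrams(text, n=3):
    """Return the set of word n-grams in a text (lowercased, whitespace-split)."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Jaccard overlap of two sets; defined here as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    # Hypothetical toy strings, not real Medline citations.
    doc1 = "the quick brown fox jumps"
    doc2 = "the quick brown fox sleeps"
    print(jaccard(word_ngrams(doc1), word_ngrams(doc2)))
```

A pair of documents would then be flagged as a candidate duplicate when the score exceeds some threshold, which would need to be tuned on labeled data.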

Modeling Harmony with Skip-Grams

String-based (or viewpoint) models of tonal harmony often struggle with data sparsity in pattern discovery and prediction tasks, particularly when modeling composite events like triads and seventh chords, since the number of distinct n-note combinations in polyphonic textures is potentially enormous. To address this problem, this study examines the efficacy of skip-grams in music research, an a...

Skip N-grams and Ranking Functions for Predicting Script Events

In this paper, we extend current state-of-the-art research on unsupervised acquisition of scripts, that is, stereotypical and frequently observed sequences of events. We design, evaluate and compare different methods for constructing models for script event prediction: given a partial chain of events in a script, predict other events that are likely to belong to the script. Our work aims to answ...

Biological Named Entity Recognition Using n-grams and Classification Methods

We propose a biological named entity recognition system which uses classification methods and an n-gram model to annotate terms in text. A novel method is presented to express lexical features in a pattern notation. Prefix and suffix characters are used instead of lists of potential terms or other external resources. Classification exemplars are created from text by using a word n-gram...

Journal

Journal title: Bioinformatics

Year: 2017

ISSN: 1367-4803,1460-2059

DOI: 10.1093/bioinformatics/btx823